Skip to content
This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

Request to add error span tag to Opentelemetry specification #86

Closed
wants to merge 1 commit into from

Conversation

vikrambe
Copy link

@vikrambe vikrambe commented Feb 5, 2020

No description provided.

## Motivation

### Why should we make this change?
Tracing backends such as jaeger adhere to opentracing spec [opentracing] (https://github.com/opentracing/specification/blob/master/semantic_conventions.md#span-tags-table). According to opentracing specification [i.e. also how jaeger has implemented] "error" span tag is a boolean value that helps in filtering spans that are in error state. Since opentelemetry does not have "error" span tag in the specification, error span tags are passed as is to the exported backend. According to opentracing spec error tag is boolean, but when zipkin span is translated to jaeger span format "error" tag continues to be a string carrying complete error message. This violates opentracing specification and also breaks jaeger functionality, as jaeger expects "error" to be boolean and "error.message" to be String type that carries description of the error message.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Tracing backends such as jaeger adhere to opentracing spec [opentracing] (https://github.com/opentracing/specification/blob/master/semantic_conventions.md#span-tags-table). According to opentracing specification [i.e. also how jaeger has implemented] "error" span tag is a boolean value that helps in filtering spans that are in error state. Since opentelemetry does not have "error" span tag in the specification, error span tags are passed as is to the exported backend. According to opentracing spec error tag is boolean, but when zipkin span is translated to jaeger span format "error" tag continues to be a string carrying complete error message. This violates opentracing specification and also breaks jaeger functionality, as jaeger expects "error" to be boolean and "error.message" to be String type that carries description of the error message.
Tracing backends such as jaeger adhere to opentracing spec [opentracing] (https://github.com/opentracing/specification/blob/master/semantic_conventions.md#span-tags-table). According to opentracing specification [i.e. also how jaeger has implemented] "error" span tag is a boolean value that helps in filtering spans that are in error state. Since opentelemetry does not have "error" span tag in the specification, error span tags are passed as is to the exported backend. According to opentracing spec error tag is boolean, but when zipkin span is translated to jaeger span format "error" tag continues to be a string carrying complete error message. This violates opentracing specification and also breaks jaeger functionality, as jaeger expects "error" to be boolean and "error.message" to be String type that carries description of the error message.

Otherwise this is rendered as a single line of code with a horizontal scroll bar (same below).

Also, please add line breaks, and check the rendered PR in general, there are a few formatting typos.

Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors are already directly representable in OpenTelemetry data model via Status field which is part of the API [1] and of the data model at the protocol level [2]:

[1] https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/api-tracing.md#set-status
[2] https://github.com/open-telemetry/opentelemetry-proto/blob/a7a235db6b05a921b06ba1deb6adcbe43343cdb6/opentelemetry/proto/trace/v1/trace.proto#L223

The Status field conveys both the fact that there is an error via Status.code field and the error message via Status.message field.

The rationale for another way of conveying errors is unclear. Is the proposal that OpenTracing's semantic convention around "error" is adopted? That would require re-considering how Status is implemented since co-existence of "error" attribute and Status code results in conflicting information (e.g. what it means if Status.code==0 and "error"=true?)

@Oberon00
Copy link
Member

Oberon00 commented Feb 5, 2020

Status and error are overlapping, but not the same, if I interpret "true if and only if the application considers the operation represented by the Span to have failed" as correct. Best example is HTTP 404 which has a Status of NOT_FOUND, but in many situations is not really an error.

@vikrambe
Copy link
Author

vikrambe commented Feb 5, 2020

Errors are already directly representable in OpenTelemetry data model via Status field which is part of the API [1] and of the data model at the protocol level [2]:

[1] https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/api-tracing.md#set-status
[2] https://github.com/open-telemetry/opentelemetry-proto/blob/a7a235db6b05a921b06ba1deb6adcbe43343cdb6/opentelemetry/proto/trace/v1/trace.proto#L223

The Status field conveys both the fact that there is an error via Status.code field and the error message via Status.message field.

The rationale for another way of conveying errors is unclear. Is the proposal that OpenTracing's semantic convention around "error" is adopted? That would require re-considering how Status is implemented since co-existence of "error" attribute and Status code results in conflicting information (e.g. what it means if Status.code==0 and "error"=true?)

@tigrannajaryan Jaeger uses Opentracing semantic, do we want to stick with 'Status' ? Is Status field borrowed from opencensus ? We might have to check for Status.code range to classify what falls in error category. Is it safe to assume 'Status.message' not being NIL as 'error=true' ?

@tigrannajaryan
Copy link
Member

I would like to see a better clarification about how Status and error will play together. As of now it appears the Status field in the proto claims to be defining what is an error:

// The Status type defines a logical error model that is suitable for different
// programming environments, including REST APIs and RPC APIs.

If this is wrong then we need to fix this.

@mwear
Copy link
Member

mwear commented Feb 6, 2020

@Oberon00
Copy link
Member

Oberon00 commented Feb 6, 2020

Not to forget the discussions on open-telemetry/opentelemetry-specification#306

Base automatically changed from master to main January 27, 2021 20:37
@bogdandrutu bogdandrutu requested a review from a team January 27, 2021 20:37
@yurishkuro
Copy link
Member

This feels like a candidate for closing. The problem statement is not clear (what does conversion from zipkin to jaeger has to do with OTEL?), and there is already a prescribed mapping of OTEL status into OT error tag in the Jaeger exporter spec.

@bogdandrutu
Copy link
Member

@vikrambe we have re-defined the status to match more with the error tag. I think there is no need for this otep anymore.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants